Structured informational link annotations

ABSTRACT

Methods, systems, and apparatus include computer programs encoded on a computer-readable storage medium. A method includes: for each of a plurality of content items in an inventory of content items: identifying an entity associated with the content item and a plurality of page types for a vertical associated with a product or service described in the content item; locating a plurality of informational pages associated with the entity; classifying each informational page based on the page types; identifying queries associated with the entity, wherein a query is used as a selection criteria for delivering one or more content items associated with the entity; for each informational page of the plurality of informational pages determining relevant queries from the identified queries; and storing in a data structure an association between the content item, data associated with the relevant queries and associated informational pages.

BACKGROUND

This specification relates to information presentation.

The Internet provides access to a wide variety of resources. For example, video and/or audio files, as well as web pages for particular subjects or particular news articles, are accessible over the Internet. Access to these resources presents opportunities for other content (e.g., advertisements) to be provided with the resources. For example, a web page can include slots in which content can be presented. These slots can be defined in the web page or defined for presentation with a web page, for example, along with search results.

Slots can be allocated to content sponsors through a reservation system or an auction. For example, content sponsors can provide bids specifying amounts that the sponsors are respectively willing to pay for presentation of their content. In turn, a reservation can be made or an auction can be performed, and the slots can be allocated to sponsors according, among other things, to their bids and/or the relevance of the sponsored content to content presented on a page hosting the slot or a request that is received for the sponsored content.

SUMMARY

In general, one innovative aspect of the subject matter described in this specification can be implemented in methods that include a method for associating content items with informational pages and queries that are relevant to the informational pages. The method includes: for each of a plurality of content items in an inventory of content items: identifying an entity associated with the content item and a plurality of page types for a vertical associated with a product or service described in the content item; locating a plurality of informational pages associated with the entity; classifying each informational page based on the page types; identifying queries associated with the entity, wherein a query is used as a selection criteria for delivering one or more content items associated with the entity; for each informational page of the plurality of informational pages determining relevant queries from the identified queries; and storing in a data structure an association between the content item, data associated with the relevant queries and associated informational pages.

In general, another aspect of the subject matter described in this specification can be implemented in methods that include a method for associating a content sponsor and a content item with associated informational pages. The method includes: identifying a content item and an entity associated with the content item; identifying a plurality of informational pages associated with the entity; determining a plurality of page types associated with informational content that are related to the entity; classifying each informational page based on the page types; determining which of the informational pages relate to the content item based on one or more criteria; and storing in a database an association between the content item, the content sponsor, and related informational pages based on page type.

In general, another aspect of the subject matter described in this specification can be implemented in computer program products. A computer program product is tangibly embodied in a computer-readable storage device and comprises instructions. The instructions, when executed by a processor, cause the processor to: for each of a plurality of content items in an inventory of content items: identify an entity associated with the content item and a plurality of page types for a vertical associated with a product or service described in the content item; locate a plurality of informational pages associated with the entity; classify each informational page based on the page types; identify queries associated with the entity, wherein a query is used as a selection criteria for delivering one or more content items associated with the entity; for each informational page of the plurality of informational pages determine relevant queries from the identified queries; and store in a data structure an association between the content item, data associated with the relevant queries and associated informational pages.

In general, another aspect of the subject matter described in this specification can be implemented in systems. A system includes a link and criteria identification system, an annotation serving system, an annotation rendering system, and a content selector. The content selector is configured to: receive a request for content; and identify a content item responsive to the request. The link and criteria identification system is configured to identify one or more informational pages associated with the content item based at least in part on a content sponsor associated with the content item and terms included in the request and/or the landing page associated with the content item. The annotation serving system is configured to: generate a link for an informational page of the one or more informational pages; and augment the content item with the generated link. The annotation rendering system is configured to provide the augmented content item responsive to the request.

These and other implementations can each optionally include one or more of the following features. The entity can be a content sponsor. Determining relevant queries from the identified queries can include determining one or more keywords for a given informational page and identifying a relevant query can include finding one or more queries that include the determined one or more keywords. A request for content can be received and a content item can be identified responsive to the request. One or more informational pages associated with the content item can be identified based at least in part on a content sponsor associated with the content item and terms included in the request and/or the landing page associated with the content item. A link can be generated for an informational page of the one or more informational pages, the content item can be augmented with the generated link and the augmented content item can be provided responsive to the request. Identifying a plurality of page types for a vertical associated with a product or service described in the content item can include retrieving a set of page types from a database based on the vertical. Identifying the plurality of page types can include evaluating a corpus of documents associated with a document sponsor to determine the page types. Evaluating the corpus can include evaluating titles of documents in the corpus including extracting n-grams from the titles that do not include a product, service or brand name associated with the content sponsor and using the extracted n-grams to identify the page types. A determination can be made as to which of the identified page types to assign to a content item by evaluating one or more of URL (Uniform Resource Locator) patterns or title n-grams. Locating a plurality of informational pages associated with the content sponsor can include receiving a list of informational pages associated with the content sponsor. Locating a plurality of informational pages associated with the content sponsor can include evaluating a corpus of documents associated with the content sponsor to identify the plurality of informational pages. Locating a plurality of informational pages associated with the content sponsor can include identifying a set of URLs that constitute informational links for the page types of the plurality of page types chosen from a total set of URLs associated with the content sponsor. The page types can be associated with informational needs of a user selected from how-to, buying guide, reviews, product walkthrough, product gallery, customer reviews, question and answer, live chat, technical specifications, technical support, top lists, or side-by-side comparisons. Identifying one or more informational pages can include selecting, based on one or more criteria, one or more informational pages from the identified informational pages and generating an informational link based on a selected informational page. A determination can be made as to a task associated with the request, wherein the task is located along a path toward conversion and wherein the criteria relates to furthering the user along the path toward conversion. The generated informational links can be assembled in an order and the assembled informational links can be provided along with the content item responsive to the request. A determination can be made as to a subset of the informational pages to present based on one or more criteria and augmenting can include presenting informational links associated with the subset. The criteria can be based on a function of a task that is inferred that the user is performing related to the request. A title for the generated link can be automatically constructed. The title can be automatically constructed based on one or more criteria including page type and available screen space. Locating a plurality of informational pages associated with the content sponsor can include submitting a request to a search system for the search system to locate informational pages that include one or more page type n-grams and receiving information from the search system identifying the located informational pages.

Particular implementations may realize none, one or more of the following advantages. Annotations can be added to a content item presented to a user which inform the user as to informational content available from a content sponsor, including informational content located on a landing page associated with the content item or at other locations designated or associated with the content item sponsor. The annotations can be presented in a structured, consistent manner which can create a predictable user experience in which the user knows what to expect when interacting with an annotation. Annotations can be provided that help fulfill an informational need of the user. Annotations can be provided that can move a user forward in a conversion funnel and thus provide a return on investment for the content item sponsor.

The details of one or more implementations of the subject matter described in this specification are set forth in the accompanying drawings and the description below. Other features, aspects, and advantages of the subject matter will become apparent from the description, the drawings, and the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of an example environment for providing an annotated content item to a user.

FIG. 2 is a block diagram of an example system for providing an annotated content item to a user.

FIG. 3 is a flowchart of an example process for associating content items with informational pages and queries that are relevant to the informational pages.

FIG. 4 is a flowchart of an example process for providing an annotated content item to a user.

FIG. 5 is a flowchart of an example process for associating a content sponsor and a content item with associated informational pages.

FIG. 6 is a block diagram of computing devices that may be used to implement the systems and methods described in this document, as either a client or as a server or plurality of servers.

Like reference numbers and designations in the various drawings indicate like elements.

DETAILED DESCRIPTION

Systems and methods presented provide structured annotations for presentation along with content items. For example, a request for content can be received from a user device and a content item can be identified in response to the request. One or more informational pages associated with the content item can be identified based at least in part on a content sponsor associated with the content item and terms included in the request and/or on the landing page associated with the content item. A link can be generated for an informational page and the content item can be augmented with the generated link. The augmented content item can be provided to the user device in response to the request.

For situations in which the systems discussed here collect information about users, or may make use of information about users, the users may be provided with an opportunity to control whether programs or features collect user information (e.g., information about a user's social network, social actions or activities, profession, a user's preferences, or a user's current location), or to control whether and/or how to receive content from the content server that may be more relevant to the user. In addition, certain data may be manipulated in one or more ways before it is stored or used, so that certain information about the user is removed. For example, a user's identity may be manipulated so that no identifying information can be determined for the user, or a user's geographic location may be generalized where location information is obtained (such as to a city, ZIP code, or state level), so that a particular location of a user cannot be determined. Thus, the user may have control over how information about the user is collected and used by a content server.

FIG. 1 is a block diagram of an example environment 100 for providing an annotated content item to a user. The example environment 100 includes a network 102, such as a local area network (LAN), a wide area network (WAN), the Internet, or a combination thereof. The network 102 connects websites 104, user devices 106, content sponsors 108, publishers 109, and a content management system 110. The example environment 100 may include many thousands of websites 104, user devices 106, and content sponsors 108. The content management system 110 may be used for selecting and providing content in response to requests for content. The content sponsors 108 can be, for example, advertisers. Other types of content sponsors are possible.

A website 104 includes one or more resources 105 associated with a domain name and hosted by one or more servers. An example website 104 is a collection of web pages formatted in hypertext markup language (HTML) that can contain text, images, multimedia content, and programming elements, such as scripts. Each website 104 can be maintained by a content publisher, which is an entity that controls, manages and/or owns the website 104.

A resource 105 can be any data that can be provided over the network 102. A resource 105 can be identified by a resource address that is associated with the resource 105. Resources 105 include HTML pages, word processing documents, portable document format (PDF) documents, images, video, and news feed sources, to name only a few. The resources 105 can include content, such as words, phrases, videos, images and sounds, that may include embedded information (such as meta-information hyperlinks) and/or embedded instructions (such as scripts).

A user device 106 is an electronic device that is under control of a user and is capable of requesting and receiving resources 105 over the network 102. Example user devices 106 include personal computers, tablet computers, mobile communication devices (e.g., smartphones), televisions, set top boxes, personal digital assistants and other devices that can send and receive data over the network 102. A user device 106 typically includes one or more user applications, such as a web browser, to facilitate the sending and receiving of data over the network 102. The web browser can interact with various types of web applications, such as a game, a map application, or an e-mail application, to name a few examples.

A user device 106 can request resources 105 from a website 104. In turn, data representing the resource 105 can be provided to the user device 106 for presentation by the user device 106. User devices 106 can also submit search queries 116 to the search system 112 over the network 102. In response to a search query 116, the search system 112 can, for example, access an indexed cache 114 to identify resources 105 that are relevant to the search query 116. The search system 112 identifies the resources 105 in the form of search results 118 and returns the search results 118 to the user devices 106 in search results pages. A search result 118 is data generated by the search system 112 that identifies a resource 105 that is responsive to a particular search query 116, and includes a link to the resource 105. An example search result 118 can include a web page title, a snippet of text or a portion of an image extracted from the web page, and the URL (Uniform Resource Locator) of the web page.

The data representing the resource 105 or the search results 118 can also include data specifying a portion of the resource 105 or search results 118 or a portion of a user display (e.g., a presentation location of a pop-up window or in a slot of a web page) in which other content (e.g., advertisements) can be presented. These specified portions of the resource or user display are referred to as slots or impressions. An example slot is an advertisement slot.

When a resource 105 or search results 118 are requested by a user device 106, the content management system 110 may receive a request for content to be provided with the resource 105 or search results 118. The request for content can include characteristics of one or more slots or impressions that are defined for the requested resource 105 or search results 118. For example, a reference (e.g., URL) to the resource 105 or search results 118 for which the slot is defined, a size of the slot, and/or media types that are available for presentation in the slot can be provided to the content management system 110. Similarly, keywords associated with a requested resource (“resource keywords”) or a search query 116 for which search results 118 are requested can also be provided to the content management system 110 to facilitate identification of content that is relevant to the resource or search query 116.

Based, for example, on data included in the request for content, a content selector 111 included in the content management system 110 can select content items that are eligible to be provided in response to the request, such as content items having characteristics matching the characteristics of a given slot. As another example, content items having selection criteria (e.g., keywords) that match the resource keywords or the search query 116 may be selected as eligible content items by the content selector 111. One or more selected content items can be provided to the user device 106 in association with providing an associated resource 105 or search results 118.

In some implementations, the content selector 111 can select content items based at least in part on results of an auction. For example, content sponsors 108 can provide bids specifying amounts that the content sponsors 108 are respectively willing to pay for presentation of their content items. In turn, an auction can be performed and the slots can be allocated to content sponsors 108 according, among other things, to their bids and/or the relevance of a content item to content presented on a page hosting the slot or a request that is received for the content item. For example, when a slot is being allocated in an auction, the slot can be allocated to the content sponsor 108 that provided the highest bid or a highest auction score (e.g., a score that is computed as a function of a bid and/or a quality measure). When multiple slots are allocated in a single auction, the slots can be allocated to a set of bidders that provided the highest bids or have the highest auction scores.
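As a non-limiting illustration, the allocation of slots by auction score can be sketched as follows. The scoring function (bid multiplied by a quality measure), the Bid structure, and the sponsor names are assumptions made for this sketch; the specification only states that a score can be computed as a function of a bid and/or a quality measure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bid:
    sponsor: str
    amount: float   # amount the sponsor is willing to pay
    quality: float  # hypothetical quality measure in [0, 1]


def auction_score(bid: Bid) -> float:
    # Assumed scoring function: bid weighted by quality.
    return bid.amount * bid.quality


def allocate_slots(bids, num_slots):
    """Return the sponsors with the highest auction scores, best first."""
    ranked = sorted(bids, key=auction_score, reverse=True)
    return [bid.sponsor for bid in ranked[:num_slots]]


bids = [
    Bid("sponsor_a", 2.00, 0.5),
    Bid("sponsor_b", 1.50, 0.9),
    Bid("sponsor_c", 1.00, 0.8),
]
winners = allocate_slots(bids, num_slots=2)
```

In this sketch a sponsor with a lower bid but a higher quality measure can outrank a sponsor with the highest bid, which is one way the relevance of a content item can influence allocation.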

In some implementations, some content sponsors 108 prefer that the number of impressions allocated to their content and the price paid for the number of impressions be more predictable than the predictability provided by an auction. For example, a content sponsor 108 can increase the likelihood that its content receives a desired or specified number of impressions, for example, by entering into an agreement with a publisher 109, where the agreement requires the publisher 109 to provide at least a threshold number of impressions (e.g., 1,000 impressions) for a particular content item provided by the content sponsor 108 over a specified period (e.g., one week). In turn, the content sponsor 108, publisher 109, or both parties can provide data to the content management system 110 that enables the content management system 110 to facilitate satisfaction of the agreement.

For example, the content sponsor 108 can upload a content item and authorize the content management system 110 to provide the content item in response to requests for content corresponding to the website 104 of the publisher 109. Similarly, the publisher 109 can provide the content management system 110 with data representing the specified time period as well as the threshold number of impressions that the publisher 109 has agreed to allocate to the content item over the specified time period. Over time, the content management system 110 can select content items based at least in part on a goal of allocating at least a minimum number of impressions to a content item in order to satisfy a delivery goal for the content item during a specified period of time.

A content sponsor 108 can create a content campaign associated with one or more content items using tools provided by the content management system 110. For example, the content management system 110 can provide one or more account management user interfaces for creating and managing content campaigns. The account management user interfaces can be made available to the content sponsor 108, for example, either through an online interface provided by the content management system 110 or as an account management software application installed and executed locally at a content sponsor's client device.

A content sponsor 108 can, using the account management user interfaces, provide campaign parameters 120 which define a content campaign. The content campaign can be created and activated for the content sponsor 108 according to the parameters 120 specified by the content sponsor 108. The campaign parameters 120 can be stored in a content sponsor parameters datastore 122. Campaign parameters 120 can include, for example, a campaign name, a preferred content network for placing content, a budget for the campaign, start and end dates for the campaign, a schedule for content placements, content (e.g., creatives), bids, and selection criteria. Selection criteria can include, for example, a language, one or more geographical locations or websites, and/or one or more selection terms.

A user of a user device 106 may, for example, submit a search query 116 that is a query to fulfill an informational need of a user regarding a product or service. For example, the user may be looking for information about how to use a product, how to buy a particular kind of product, a comparison between similar products, or reviews of a product. The content management system 110 can annotate a content item of a content sponsor 108 and provide the annotated content item to the user device 106 to fulfill an informational need represented by such a search query 116 while also moving the user along a conversion funnel to increase ROI (Return On Investment) for the content sponsor 108. An annotated content item, for example, with links to informational content such as a buying guide or reviews, may be more useful for a user than a content item that includes products available for purchase but not other informational content about the products.

A link identification system 124 can, for each content item in an inventory 126 of content items, identify a content sponsor 108 associated with the content item and a set of page types for a vertical associated with a product or service described in the content item. The link identification system 124 can identify the set of page types for the vertical, for example, by querying a page types data store 128. For example, for a retail vertical, the page types data store 128 may include a set of predefined page types including how-to guides, buying guides, and/or other page types. For a travel vertical, the page types data store 128 may include pre-defined page types of transportation, lodging, nearby-destinations, travel guide, and/or other page types. As described in more detail below, a set of page types can be programmatically generated.

The link identification system 124 can locate a set of informational pages associated with the content sponsor 108, such as by using the search system 112. Other approaches for locating informational pages are discussed in more detail below. The link identification system 124 can classify each informational page based on the page types. For example, a particular informational page can be classified as having a page type of buying guide. A link title synthesis system 130 can associate one or more titles with each informational page identified by the link identification system 124. Identification and generation of link titles is described in more detail below.

A criteria identification system 132 can identify selection criteria for the informational pages identified by the link identification system 124. For example, the criteria identification system 132 can identify queries associated with a content sponsor 108 in the content sponsor parameters datastore 122, such as queries that are used as selection criteria for delivering one or more content items associated with the content sponsor 108. The criteria identification system 132 can determine relevant queries from the identified queries for each informational page associated with the content sponsor 108. For example, the criteria identification system 132 can determine one or more keywords for a given informational page and identify relevant queries by finding one or more queries that include the determined one or more keywords. The criteria identification system 132 (or the link identification system 124) can store, in an associations datastore 134, for each content item associated with a content sponsor 108, an association between the content item, data associated with the relevant queries, and informational pages associated with the content item.
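As a non-limiting illustration, the keyword-based determination of relevant queries and the storing of associations can be sketched as follows. The function names, the in-memory dictionary standing in for the associations datastore 134, the whole-phrase matching rule, and the example URLs and queries are all assumptions made for this sketch.

```python
from collections import defaultdict


def relevant_queries(page_keywords, queries):
    """Return the queries that contain at least one page keyword as a whole phrase."""
    matches = []
    for query in queries:
        padded = f" {query.lower()} "
        if any(f" {keyword.lower()} " in padded for keyword in page_keywords):
            matches.append(query)
    return matches


def build_associations(content_item_id, pages, queries):
    """Associate a content item with each page and that page's relevant queries.

    `pages` maps a page URL to the keywords determined for that page.
    """
    store = defaultdict(list)
    for url, keywords in pages.items():
        store[content_item_id].append(
            {"page": url, "queries": relevant_queries(keywords, queries)}
        )
    return dict(store)


pages = {
    "https://example.com/camera-buying-guide": ["buying guide"],
    "https://example.com/camera-reviews": ["reviews"],
}
queries = ["camera buying guide", "camera y reviews", "cheap cameras"]
associations = build_associations("item-1", pages, queries)
```

Here the query "cheap cameras" is associated with neither page because it contains none of the page keywords.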

When the content management system 110 receives a request for content, the content selector 111 can identify a content item responsive to the request, as described above. An annotation serving system 136 can identify one or more informational pages associated with the content item, for example, by querying the associations datastore 134 using an identifier of the content sponsor 108 associated with the content item and terms included in the request for content. The annotation serving system 136 can rank and/or filter the identified informational pages, as described in more detail below, to determine a set of ranked and selected informational pages. The annotation serving system 136 can generate a link for each selected informational page and annotate (e.g., augment) the content item with the generated links, for example, using link titles determined by the link title synthesis system 130. The annotated content item can be provided to the requesting user device 106, as illustrated by an annotated content item 140. A user device can include tools for rendering content delivered by the content management system. In the example shown, an annotation rendering system 142 is used to display the annotated content item 140, including generated links and, for example, a header associated with the generated links, to the user of the user device 106.
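As a non-limiting illustration, the lookup, ranking, filtering, and augmentation performed at serving time can be sketched as follows. The ranking criterion (count of request terms shared with a page's keywords), the dictionary layout, and all identifiers and URLs are assumptions made for this sketch; the specification leaves the ranking and filtering criteria open.

```python
def annotate(content_item, associations, request_terms, max_links=2):
    """Return a copy of the content item augmented with ranked informational links."""
    terms = {term.lower() for term in request_terms}
    scored = []
    for page in associations.get(content_item["sponsor_id"], []):
        overlap = len(terms & set(page["keywords"]))
        if overlap > 0:  # filter out pages unrelated to the request
            scored.append((overlap, page))
    # Rank by overlap, highest first; the sort is stable, so ties keep stored order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    links = [{"title": p["title"], "url": p["url"]} for _, p in scored[:max_links]]
    return {**content_item, "links": links}


associations = {
    "sponsor-1": [
        {"url": "https://example.com/guide", "title": "Buying guide",
         "keywords": ["camera", "buying", "guide"]},
        {"url": "https://example.com/reviews", "title": "Reviews",
         "keywords": ["camera", "reviews"]},
        {"url": "https://example.com/support", "title": "Support",
         "keywords": ["support"]},
    ]
}
item = {"id": "item-1", "sponsor_id": "sponsor-1", "text": "Camera Y on sale"}
annotated = annotate(item, associations, ["camera", "reviews"])
```

The original content item is left unchanged, and the annotated copy carries the generated links in ranked order.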

FIG. 2 is a block diagram of an example system 200 for providing an annotated content item to a user. A content server 202 can provide content in response to requests for content. The content server 202 can augment (e.g., annotate) a content item with one or more links to informational pages that are associated with an entity (e.g., a content sponsor) that is associated with the content item. In some implementations, the informational pages are identified after a request for content is received. In some implementations, the informational pages are identified before the request is received.

A link identification system 204 can identify, for a given content sponsor, a set of informational links (e.g., URLs) for various page types. In some implementations, the link identification system 204 can identify page types based on a determined vertical. A vertical can be determined, for example, for each content sponsor, based on information stored or known about the content sponsor. As another example, a vertical can be determined for a content item based on information included in or associated with the content item. The link identification system 204 can query a page types datastore 206 to determine a set of page types associated with a vertical. For instance, as illustrated by example data 208, the link identification system 204 can determine page types of “how-to”, buying guide, reviews, product walkthrough, product gallery, and top lists for a retail vertical.

In some implementations, a set of one or more n-grams are associated with each page type. For example, n-grams of “buying guide”, “purchase guide”, and “how to choose” can be associated with the page type “buying guide”. Associations of n-grams to page types can be stored in the page types data store 206. The page types data store 206 can include a version of a page type definition, including mappings of verticals to page types and page types to associated n-grams, for each of multiple languages (e.g., English, German, Japanese).
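As a non-limiting illustration, a mapping of page types to n-grams can be used to classify a page by its title as follows. The n-grams for "buying guide" are taken from the example above; the "reviews" entry, the substring matching rule, and the function name are assumptions made for this sketch.

```python
# Page type -> associated n-grams. A fuller definition could be kept per
# language and per vertical; only a single English mapping is shown here.
PAGE_TYPE_NGRAMS = {
    "buying guide": ["buying guide", "purchase guide", "how to choose"],
    "reviews": ["reviews"],  # hypothetical entry for illustration
}


def classify_page(title):
    """Return the first page type with an n-gram occurring in the title, else None."""
    lowered = title.lower()
    for page_type, ngrams in PAGE_TYPE_NGRAMS.items():
        if any(ngram in lowered for ngram in ngrams):
            return page_type
    return None


guide_type = classify_page("How to Choose a Digital Camera")
reviews_type = classify_page("Camera Y Reviews")
unknown_type = classify_page("Contact Us")
```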

In some implementations, the link identification system 204 programmatically determines a set of page types, such as for a particular content sponsor or for a group of content sponsors. For example, the link identification system 204 can identify a set of product-related pages that include product names or keywords, for the content sponsor or for the group of content sponsors. For example, the set of product-related pages can be identified using a search system 210, such as by querying for product names and keywords and using a domain restricted search using the domain of a given content sponsor. As another example, a set of product-related pages can be received from a content sponsor. For example, a content sponsor 211 can use a campaign management user interface 212 presented on a content sponsor client device 214 to provide a set of links to product-related pages, as illustrated by a product-pages input included in example content sponsor inputs 216.

The link identification system 204 can process each identified product-related page. For each product-related page, the link identification system 204 can identify a title for the product-related page, create a copy of the title, remove product names and keywords from the copy of the title, optionally remove stop words from the copy of the title, optionally remove n-grams that match a content sponsor name or a site name, and optionally perform stemming on remaining n-grams included in the copy of the title. The link identification system 204 can identify the remaining n-grams included in the copy of the title as candidate page types. For instance, an example title of a product-related page may be “Example.com—Product Reviews for Camera Y”. The site-name n-gram of “Example.com”, the stop word “for”, and the product-name n-gram of “Camera Y” can be removed from a copy of the example title, leaving the text “Product Reviews” in the copy of the example title. The n-grams “Reviews” and “Product Reviews” can be identified as candidate page types, for example. In some implementations, candidate page-types are identified by processing one or more page heading elements in a manner similar to the processing of page titles described above.
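As a non-limiting illustration, the title processing described above can be sketched as follows. The stop-word list, the choice to treat removed phrases and stop words as boundaries that n-grams do not span, and the function name are assumptions made for this sketch; stemming is omitted. The sketch enumerates every contiguous n-gram of the remaining text, which is a superset of the two n-grams named in the example above.

```python
import re

STOP_WORDS = {"a", "an", "for", "of", "the"}  # hypothetical stop-word list


def candidate_page_types(title, site_names, product_names, max_n=3):
    """Return n-grams left in a title after removing site names, products and stop words."""
    text = title.lower()
    # Replace each removed phrase with a boundary marker so that no n-gram spans it.
    for phrase in list(site_names) + list(product_names):
        text = text.replace(phrase.lower(), " | ")
    tokens = re.findall(r"[a-z0-9]+|\|", text)

    # Split the tokens into segments at boundary markers and stop words.
    segments, current = [], []
    for token in tokens:
        if token == "|" or token in STOP_WORDS:
            if current:
                segments.append(current)
                current = []
        else:
            current.append(token)
    if current:
        segments.append(current)

    # Enumerate contiguous n-grams within each segment, shortest first.
    candidates = []
    for segment in segments:
        for n in range(1, max_n + 1):
            for start in range(len(segment) - n + 1):
                candidates.append(" ".join(segment[start:start + n]))
    return candidates


candidates = candidate_page_types(
    "Example.com - Product Reviews for Camera Y",
    site_names=["Example.com"],
    product_names=["Camera Y"],
)
```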

In some implementations, some candidate page types are weighted as being more or less relevant than other candidate page types. For example, candidate page types that are adjacent to a product name or keyword in a product-related page title can be weighted higher than candidate page types that are not adjacent to a product name or keyword. The link identification system 204 can determine a frequency of occurrence for each candidate page type across all of the processed product-related pages. The link identification system 204 can select a set of page types from the candidate page types, such as by selecting page types associated with the top N (e.g., ten) highest frequencies of occurrence across product-related pages, or selecting page types having a frequency of occurrence across product-related pages that is higher than a threshold frequency.
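The frequency-based selection can be sketched as follows. This is one possible reading, not a definitive implementation: frequency is assumed to be the number of pages on which a candidate occurs, the adjacency weighting is omitted, and the names `select_page_types`, `top_n`, and `min_frequency` are hypothetical.

```python
from collections import Counter


def select_page_types(candidates_per_page, top_n=None, min_frequency=None):
    """Select page types by how many pages each candidate occurs on.

    candidates_per_page: one list of candidate page types per processed page.
    top_n: keep only the N most frequent candidates (optional).
    min_frequency: keep only candidates occurring on more pages than this (optional).
    """
    counts = Counter()
    for candidates in candidates_per_page:
        counts.update(set(candidates))  # count each candidate once per page

    items = list(counts.items())
    if min_frequency is not None:
        items = [(gram, count) for gram, count in items if count > min_frequency]

    # Most frequent first; ties broken alphabetically for a stable result.
    items.sort(key=lambda item: (-item[1], item[0]))
    if top_n is not None:
        items = items[:top_n]
    return [gram for gram, _ in items]
```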

The link identification system 204 can identify, for a given content sponsor, a set of informational links (e.g., URLs) associated with the selected page types. The informational links can be identified, for example, from a set of total URLs that are associated with (e.g., owned by) the content sponsor. The set of total URLs for the content sponsor can be identified, for example, by the search system 210. The search system 210 can, for example, identify (e.g., “crawl”) the URLs that are associated with one or more domains associated with the content sponsor and can store information on the identified URLs in an indexed cache 218. The link identification system 204 can query the search system 210 to determine the total set of URLs for the content sponsor.

As another example, the link identification system 204 can identify the total set of URLs for the content sponsor by identifying a set of pre-defined landing page URLs that are associated with the content sponsor, such as by querying a content sponsor parameters datastore 220. As yet another example, in some implementations, the content sponsor can provide the total set of URLs for the content sponsor, such as by providing landing page and/or other URLs. For example, the content sponsor 211 can use the campaign management user interface 212 to provide the total set of URLs for the content sponsor, as illustrated by a URLs input included in the example content sponsor inputs 216.

In some implementations, the link identification system 204 identifies informational links from the total set of URLs for a content sponsor by analyzing the title of each page corresponding to a URL in the total set of URLs. For example, the link identification system 204 can determine if a title includes an n-gram that is associated with a particular page type. In some implementations, the link identification system 204 determines whether the title includes a page-type related n-gram after stemming and stop-word removal processing. If a title includes a page-type related n-gram, the page that includes the title can be identified as an informational page of the particular page type and the address associated with the page can be identified as an informational link associated with the particular page type.
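A minimal sketch of this title-based check follows. The crude plural-stripping step stands in for real stemming and is an assumption, as are the stop-word list and the names `normalize` and `classify_title`.

```python
import re

# Illustrative stop-word list (an assumption).
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def normalize(text):
    """Lowercase, tokenize, drop stop words, and strip a trailing plural 's'."""
    tokens = []
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        if token in STOP_WORDS:
            continue
        if len(token) > 3 and token.endswith("s"):
            token = token[:-1]  # crude stand-in for a real stemmer
        tokens.append(token)
    return tokens


def classify_title(title, page_types):
    """Return the first page type whose n-gram occurs in the title, else None."""
    title_tokens = normalize(title)
    for page_type in page_types:
        gram = normalize(page_type)
        n = len(gram)
        if n and any(
            title_tokens[i:i + n] == gram
            for i in range(len(title_tokens) - n + 1)
        ):
            return page_type
    return None
```

A page whose title yields a page type would be treated as an informational page of that type, and its URL as an informational link.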

In some implementations, the link identification system 204 uses the search system 210 to identify informational pages associated with a particular content sponsor. For example, the link identification system 204 can request the search system 210 to perform searches that include page-type related n-grams and a site restriction on a domain associated with the content sponsor. For example, the search system 210 can perform a search of “‘buying guide’ site:example.com” to determine buying-guide pages associated with the content sponsor having a domain of “example.com”. As another example, the search system 210 can perform a search of “‘buying guide’ site:example.com ‘Camera Y’” to find buying guides for a “Camera Y” product sold by a content sponsor having a domain of “example.com”.

In some implementations, the link identification system 204 identifies informational links by matching patterns, such as regular expressions, to sets of URLs associated with a particular content sponsor. For example, the link identification system 204 can search for and identify patterns in URLs provided by the content sponsor, in informational links identified by analyzing page titles, in informational links identified using the search system 210, and/or in other sets of URLs. For example, each URL in a set of URLs associated with a content sponsor of “example” may have a pattern of “www.example.com/products/{id}?view=comparison{otherOptionalParams}”, where each URL in the set is a product comparison page. If another URL which matches the pattern is identified, such as “www.example.com/products/2392348?view=comparison&tracking=campaignId”, the URL can be automatically identified as a product comparison page.
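The example URL pattern can be expressed as a regular expression. The following sketch is illustrative only; the translation of the `{id}` and `{otherOptionalParams}` placeholders into the expression below is an assumption.

```python
import re

# {id} is assumed to be any non-empty path segment; {otherOptionalParams}
# is assumed to be zero or more "&name=value" query parameters.
COMPARISON_URL = re.compile(
    r"^www\.example\.com/products/(?P<id>[^/?#]+)\?view=comparison(?:&[^#]*)?$"
)


def comparison_product_id(url):
    """Return the product id if the URL matches the comparison pattern, else None."""
    match = COMPARISON_URL.match(url)
    return match.group("id") if match else None
```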

Other approaches can be used to identify informational pages. For example, content of landing pages associated with a content sponsor can be evaluated, such as to determine whether a link included in a landing page is an informational link (e.g., by identifying page-type related n-grams in link title text). As another example, in some implementations a content sponsor can include metadata in pages associated with the content sponsor (e.g., HTML metadata), and the link identification system 204 can identify informational pages by determining which pages have metadata indicating a page is an informational page. As yet another example, the link identification system 204 can query a page classification system (not shown), where the page classification system includes a datastore of page classifications. The page classification system may, for example, classify pages using a machine learning model.

Information can be associated with each identified informational link. For example, a link title synthesis system 221 can associate one or more titles with each informational page identified by the link identification system 204. A title can be, for example, used as a link title in an annotation used for a content item. The link title synthesis system 221 can determine a title for an informational page, for example, by identifying a title element associated with the informational page or by using a page-type n-gram as a link title (e.g., all buying guide pages may get a link title of “buying guide”). As another example, the link title synthesis system 221 can receive a title for an informational page from a content sponsor, such as illustrated by a “titles” input included in the example content sponsor inputs 216. As yet another example, the link title synthesis system 221 can programmatically generate a title, such as by using one or more of a content sponsor name, a page type, frequent queries associated with the informational page, and/or content from the informational page. For example, a generated link title can be “56 Product Reviews from example.com”. Frequent queries are discussed in more detail below.

In some implementations, a snippet of text is associated with an informational page, in addition to a title that is associated with the informational page. The snippet and the title can each be included, for example, in an annotation associated with the informational page. A snippet can be received, for example, from a snippet generator (not shown) or from the search system 210. As another example, a snippet can be generated from the content of the informational page, such as by identifying a non-boilerplate portion of text included in the informational page, such as the first N words or sentences of non-boilerplate text. Other approaches for generating a snippet can be used. A snippet can be included in an annotation, for example, before or after a link associated with the informational page. For example, an annotation can include a link for a tennis racquet buying guide informational page and can include a snippet associated with the informational page. The annotation can appear, for example, as “Tennis Racquet Buying Guide—The best racquet on the market is . . . ” or “The best racquet on the market is . . . Tennis Racquet Buying Guide”.

Other information can be associated with each identified informational link. For example, a criteria identification system 224 can identify selection criteria for the informational pages identified by the link identification system 204. For example, the criteria identification system 224 can identify queries associated with one or more content sponsors in the content sponsor parameters datastore 220, such as queries that are used as selection criteria for delivering one or more content items associated with a given content sponsor. The criteria identification system 224 can determine relevant queries from the identified queries for each informational page. For example, the criteria identification system 224 can determine one or more keywords for a given informational page and identify relevant queries by finding one or more queries that include the determined one or more keywords. In some implementations, the criteria identification system 224 receives selection criteria from a content sponsor, such as illustrated by a selection criteria input included in the example content sponsor inputs 216.

In some implementations, the criteria identification system 224 identifies criteria for an informational page by analyzing the title of the informational page. For example, for each informational page, the criteria identification system 224 can create a copy of the title, remove page-type n-grams from the copy of the title, optionally perform stemming and stop-word removal on the copy of the title, optionally remove site names from the copy of the title, and identify remaining n-grams in the copy of the title. The remaining n-grams can be identified as candidate criteria for the informational page. For example, suppose a title of an informational page is “example.com—Product Reviews for Camera Y”. A “Product Reviews” page-type n-gram, a stop-word “for”, and a site name “example.com” can be removed from a copy of the title, leaving the copy of the title as “Camera Y”. N-grams of “Y” and “Camera Y” can be identified as candidate criteria. In some implementations, candidate criteria that are adjacent to a page-type n-gram in the original title can be weighted higher than candidate criteria that are not adjacent to a page-type n-gram.

In some implementations, candidate criteria are identified by processing one or more page heading elements in a manner similar to the processing of page titles described above. In some implementations, candidate criteria are identified by analyzing URLs of informational pages, such as by identifying a particular URL pattern that is shared across a set of URLs for the content sponsor, identifying one or more portions of the pattern as corresponding to a candidate keyword, and identifying particular candidate keywords by identifying portions of respective URLs that correspond to the identified URL-pattern portions.

The criteria identification system 224 can process each informational page for a content sponsor to determine candidate criteria for the content sponsor. In some implementations, the criteria identification system 224 identifies a frequency of occurrence of each candidate criterion across all of the informational pages. The criteria identification system 224 can, for example, select as criteria candidate criteria that have a frequency of occurrence across the informational pages that is greater than a threshold frequency. As another example, the criteria identification system 224 can select as criteria the candidate criteria that have the N (e.g., ten) highest frequencies. As yet another example, the criteria identification system 224 can select criteria using a formula that takes into account frequency of candidate criteria across informational pages and whether a respective candidate criterion is used as a selection criterion by the content sponsor for a content item associated with the content sponsor.

In some implementations, a keyword hierarchy 226 data structure is accessed to identify keywords to associate with informational pages. For example, the keyword hierarchy 226 can include multiple sets of related keywords, where each set of related keywords is associated with a concept, with keywords in the set ordered from general to specific. For example, a set of keywords related to a camera of a model XYZ9 from a camera maker named Camera-Maker-A can include the keywords “camera”, “digital camera”, “Camera-Maker-A camera”, “XYZ9”, and “Camera-Maker-A XYZ9”, with the keywords being ordered from more general to more specific. The criteria identification system 224 can determine that a selected criterion for an informational page is included in a set of related keywords in the keyword hierarchy 226 and can identify one or more other keywords in the set to use as other selected criteria for the informational page. For example, the criteria identification system 224 can identify “Camera-Maker-A camera buying guide” and “XYZ9 buying guide” as additional selected criteria for the informational page based on the criterion “Camera-Maker-A XYZ9” being an existing criterion for the informational page.
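One way to picture the keyword hierarchy lookup is sketched below. The representation as a list of ordered keyword lists, the function name `related_criteria`, and the choice to append the page type to each related keyword are assumptions for illustration.

```python
# Each inner list is one set of related keywords, ordered general to specific.
KEYWORD_HIERARCHY = [
    [
        "camera",
        "digital camera",
        "Camera-Maker-A camera",
        "XYZ9",
        "Camera-Maker-A XYZ9",
    ],
]


def related_criteria(criterion, hierarchy, page_type=None):
    """Return other keywords from the set containing the given criterion.

    If page_type is given, it is appended to each related keyword.
    """
    for keyword_set in hierarchy:
        if criterion in keyword_set:
            others = [keyword for keyword in keyword_set if keyword != criterion]
            if page_type:
                return [f"{keyword} {page_type}" for keyword in others]
            return others
    return []
```

A caller could then keep only some of the returned keywords, for example those at a chosen level of generality.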

In some implementations, the criteria identification system 224 can identify a set of frequent queries most often associated with a particular candidate informational page. For example, the criteria identification system 224 can, for a given candidate informational page, identify queries which result in the candidate informational page being displayed in a search result or that result in a content item associated with the candidate informational page being selected for a content request that is associated with one or more identified queries. For example, the content item can be configured to have the candidate informational page as a landing page. In addition to impressions of search results or content items relating to the identified queries, other factors, such as a click-through rate of a search result or content item, can be used to identify and rank the frequent queries.

For example, a particular number of queries can be identified as frequent queries for the candidate informational page. For example, a relevance score can be determined for each query associated with the candidate informational page (e.g., based on impressions, click-through rate) and queries having relevance scores above a threshold can be identified or queries having a top N (e.g., five) relevance scores can be identified. For each identified query, the link identification system 204 can use the query to determine whether the associated candidate informational page is identified as an informational page for the content sponsor and a page type for the informational page. For example, the queries can be evaluated in a manner similar to the evaluation of titles and URLs described above. In general, the content system 202 can use one or more approaches of evaluating titles, URLs, page headings, and queries to identify informational pages and page types of identified informational pages.
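The selection of frequent queries by relevance score can be sketched as follows. The scoring formula (impressions plus a weighted click count) is purely an assumption, since no particular formula is specified; the names `QueryStats`, `relevance_score`, and `frequent_queries` are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class QueryStats:
    query: str
    impressions: int
    clicks: int


def relevance_score(stats, click_weight=10.0):
    """Assumed score: impressions plus weighted clicks (not a specified formula)."""
    return stats.impressions + click_weight * stats.clicks


def frequent_queries(all_stats, top_n=None, threshold=None):
    """Return queries ranked by score, filtered by threshold and/or top N."""
    scored = [(relevance_score(stats), stats.query) for stats in all_stats]
    if threshold is not None:
        scored = [(score, query) for score, query in scored if score > threshold]
    # Highest score first; ties broken alphabetically for a stable result.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    if top_n is not None:
        scored = scored[:top_n]
    return [query for _, query in scored]
```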

For example, a query of “dslr camera buying guide” can be identified as a frequent query for a candidate informational page having a URL of “www.example.com/cameras?guide=1”. A page type n-gram of “buying guide” can be identified in the query of “dslr camera buying guide”, and the candidate informational page can be identified as an informational page having a page type of “buying guide”. Frequent queries can be provided to the link title synthesis system 221 and the link title synthesis system 221 can use a frequent query to generate a title to be associated with an informational page. For example, if a query of “dslr cameras buying guide” is provided to the link title synthesis system 221 for an informational page having a URL of “www.example.com/cameras?guide=1”, the link title synthesis system 221 can generate a link title to be associated with the informational page that is or that includes “DSLR Camera Buying Guide”.

The criteria identification system 224 (or the link identification system 204) can store, in an associations datastore 228, for each content item for which selection criteria and informational pages have been identified, an association between the content item, data associated with the selection criteria determined to be relevant to the content item, and informational pages associated with the content item. The data in the associations datastore 228 can be keyed, for example, by a content sponsor identifier and by selection criteria. For example, for each content sponsor identifier and selection criteria combination, one or more tuples can be stored that include a page type, an informational page URL, and a link title. For instance, example associations data 232 keyed by multi-valued key 234 of a content sponsor id of “sponsor123” (which can be, for example, an identifier associated with the content sponsor 211) and selection criteria of “DSLR Camera” includes tuples 235, 236, and 237 for page types of buying guide, reviews, and product comparison, respectively.
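The keyed layout described above can be pictured with a small in-memory structure. This is a sketch under stated assumptions, not the datastore itself: the class and method names are hypothetical, the case-insensitive key normalization is an assumption, and any URLs passed in are illustrative.

```python
from collections import defaultdict
from typing import NamedTuple


class InfoLink(NamedTuple):
    page_type: str
    url: str
    link_title: str


class AssociationsStore:
    """Maps (content sponsor id, selection criteria) to a list of InfoLink tuples."""

    def __init__(self):
        self._data = defaultdict(list)

    @staticmethod
    def _key(sponsor_id, criteria):
        # Criteria are compared case-insensitively (an assumption).
        return (sponsor_id, criteria.strip().lower())

    def add(self, sponsor_id, criteria, link):
        self._data[self._key(sponsor_id, criteria)].append(link)

    def lookup(self, sponsor_id, criteria):
        # Return a copy so callers cannot mutate the stored list.
        return list(self._data.get(self._key(sponsor_id, criteria), []))
```

For the example key of “sponsor123” and “DSLR Camera”, three `add` calls would store one tuple each for the buying guide, reviews, and product comparison page types, and a later `lookup` with the same key would return them in insertion order.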

When the content system 202 receives a request for content, a content selector 240 can select a content item to serve in response to the request, such as by identifying a content item in a content items datastore 242. For example, a request for content can be received from a user device 244 of a user 246, for a content slot 248 included in a web page 250 displayed on the user device 244. The web page 250 is a search results page presented to the user 246 in response to the user 246 entering a search query 251 of “DSLR camera buying guide”. The content selector 240 can select, for example, a content item 252. The content item 252 can be associated, for example, with the content sponsor 211.

An annotation serving system 254 can select one or more informational pages associated with the content item 252, for example, by querying the associations datastore 228 using an identifier of the content sponsor 211 and the search query 251. The annotation serving system 254 can generate a link for each selected informational page and annotate (e.g., augment) the content item 252 with the generated links. For example, the content item 252 includes an annotation 256 which includes links 257, 258, and 259. The links 257, 258, and 259 are associated with the tuples 235, 236, and 237, respectively.

The annotation 256 can be displayed to the user using an annotation rendering system 262. In the example shown, annotation 256 includes a header (e.g., “Info:”). Other headers can be used, such as “Learn More”. Headers can be descriptive of the informational content that is associated with one or more of the informational links provided, or can take the form of a call to action that moves the user farther along the conversion funnel. In some implementations, the annotation rendering system 262 selects a header based on an amount of available screen space. For example, the annotation rendering system 262 can select a header so that the annotation 256 occupies one line of the content item 252. The annotation rendering system 262 (or the annotation serving system 254) can also select titles for the links 257, 258, and 259 based on available screen space.

A generated header may include information identifying the content sponsor and information about selection criteria. For example, suppose selection criteria associated with the content item 252 is “CameraBrand DSLR ZY200” from example.com and that informational pages identified by the link identification system 204 are associated with criteria of “DSLR Cameras”. In such an instance, a header can be “About DSLR cameras”, “From example.com”, or “More about DSLR cameras from example.com”, to name a few examples.

The annotation rendering system 262 can select titles based on how many items (e.g., links) are shown in an annotation. For example, if a larger number of items are to be shown, shorter titles can be used, and if a smaller number of items are to be shown, longer titles can be used. For example, an annotation 264 can be included in a content item 266. The content item 266 can be provided to the user device 244 in response to a request for content for a content slot 270 included in a search results page 272. The annotation 264 includes items 274, 275, 276, and 277. Since the annotation 264 includes more items than the annotation 256, shorter titles may be used for the items included in the annotation 264 as compared to items included in the annotation 256. For example, the item 274, which is a link to a buying guide, has a title of “Buy. Guide”, while a corresponding link 257 has a longer title of “Buying Guide”.

As shown in the annotation 264, some items in an annotation (e.g., items 275, 276, and 277) can be “badges”, or words to flag or mark the annotation 264 and/or the content item 266. A badge can inform the user about informational content that is available from the content sponsor associated with the content item 266 (e.g., the content sponsor 211), such as information available from a landing page associated with the content item 266 (e.g., reviews, product comparison, or a product gallery). A badge, for example, may not be a link, meaning the badge may not respond specifically to user interaction. If a user selects a badge such as the item 275, 276, or 277, for example, the selection action may be interpreted as a selection of the content item 266, which can result in a display of a landing page associated with the content item 266. The link 274 may, for example, be a link to the landing page associated with the content item 266, a link to a particular section of the landing page, or a link to a different web page associated with the content sponsor 211. Badges to include in an annotation can be determined, for example, by analyzing page heading elements of the landing page in a manner similar to the determination of candidate page types based on page heading elements, as described above.

The annotation serving system 254 may use various approaches to rank and/or filter the items included in an annotation. For example, the annotation serving system may select zero or one item (e.g., link, badge) for every page type, up to a threshold number of items (e.g., seven), so that a diversity and balance of page types are included in the annotation. The threshold number of items can be predefined (e.g., seven), or can be dynamically determined based on the length of potential title text for candidate items that may be included in the annotation.
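The one-item-per-page-type selection can be sketched as follows. The per-item score used to order candidates is an assumption (how items are ranked is discussed separately), and the function name `select_items` is hypothetical.

```python
def select_items(candidates, max_items=7):
    """Pick at most one item per page type, up to max_items.

    candidates: iterable of (page_type, title, score) tuples, where score is
    an assumed ranking value with higher meaning more relevant.
    """
    selected = []
    seen_page_types = set()
    # sorted() is stable, so equal scores keep their input order.
    for page_type, title, _score in sorted(candidates, key=lambda c: -c[2]):
        if page_type in seen_page_types:
            continue  # this page type is already represented
        seen_page_types.add(page_type)
        selected.append((page_type, title))
        if len(selected) == max_items:
            break
    return selected
```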

In some implementations, candidate items for an annotation may be ranked based on relevance to a query associated with the request for content for which the content item is selected. For example, for the annotation 256, the associated content item 252 may have been selected based on the search query 251 of “DSLR camera buying guide”. Informational links associated with the page type of buying guide may be ranked higher by the annotation serving system 254 than informational pages of other page types for the given vertical. In this example, one informational page associated with the content sponsor 211 has a page type of buying guide, but in other examples, multiple informational pages having a page type of buying guide can be selected and included in the annotation 256, based on the buying guide page type having a higher weight than other page types. In some implementations, a page type can be ranked higher than other page types if the page type corresponds to selection criteria associated with the content item that includes the annotation, or with other criteria, such as being associated with a user task that is inferred or a user's determined current activity or location in a conversion funnel.

The annotation serving system 254 can use other approaches for filtering or ranking candidate items considered for inclusion in an annotation. For example, the annotation serving system can select items so that each item is at a same or similar level of granularity, such as is maintained in the keyword hierarchy 226. For example, if the annotation serving system 254 includes in an annotation a link to a buying guide for a particular DSLR camera, the annotation serving system 254 may also select a link to a reviews page for the same DSLR camera rather than selecting a link to a reviews page for DSLR cameras in general or for some other camera model.

A content sponsor (e.g., the content sponsor 211) can receive reports 280 which can include information related to user interaction with the content items that include annotations, including interactions with items included in annotations. The reports 280 can include information about interactions with annotations that can be associated with various other user interactions, such as conversion actions. In some implementations, the content sponsor 211 can use the campaign management user interface 212 to register for or to opt out of inclusion of annotations in content items associated with the content sponsor 211. In some implementations, the content sponsor 211 can register for or opt out of automatic creation and/or serving of content items whose content includes information about and/or links to informational pages that have been determined for the content sponsor by the link identification system 204.

For example, a content creator 282 can automatically create a content item 284 based on informational content identified by the link identification system 204. For example, the content item 284 includes a title 285 and a link 286, based on the link identification system 204 identifying the link 286 as a link to an informational page. The content item 284 includes an annotation 287 that includes links for reviews, product comparison, and a product gallery, respectively.

In some implementations, the content creator 282 automatically creates the content item 284 and presents the content item 284 to the content sponsor 211 (e.g., on the campaign management user interface 212) as a suggestion for a new content item for the content sponsor 211 to include in a content campaign. When the content sponsor 211 approves the suggestion, the campaign can be activated and the content item 284 can be served in response to requests for content that are received. In some implementations, the content item 284 is automatically created and served in response to requests for content without specific approval from the content sponsor 211. That is, the content sponsor 211 may have registered to have content items automatically created and served without requiring specific approval of each created content item. The content item 284 can be provided, for example, to a user device 288 of a user 289, for presentation in a content slot 290 included in a web page 291. The content item 284 can be selected, for example, based at least in part on the content item 284 having associated keywords that match, for example, the content of the web page 291.

In some implementations, the annotation serving system 254 creates annotations that can be included in other types of content items. For example, the annotation serving system 254 can create an annotation that can be used to augment a search result generated by the search system 210. For example, a search results web page 293 is displayed on a user device 294 of a user 295. The search results web page 293 includes a search result 296. The search result 296 includes an annotation 297 created by the annotation serving system 254 and displayed by, for example, the annotation rendering system 262. The link identification system 204 can, for example, determine informational pages associated with an entity (e.g., a publisher, a content sponsor) associated with the search result 296 and the annotation serving system 254 can create the annotation 297 which includes links to the identified informational pages, such as links to a buying guide, reviews, a product comparison, and/or a product gallery, as shown in the annotation 297. In some implementations, links identified by the link identification system 204 can be merged or combined in the annotation 297 with other links (e.g., other site links) identified, for example, by the search system 210 or some other system.

Although generally described above as operations being performed, in order, by the link identification system 204, the link title synthesis system 221, the criteria identification system 224, the annotation serving system 254, and the annotation rendering system 262, operations can be performed in different orders and by different systems. The functionality of two or more systems can be combined into a single system. For example, the link title synthesis system 221 and the criteria identification system 224 can be combined. As another example, the criteria identification system 224 can identify a set of keywords for a content sponsor before the link identification system 204 identifies informational pages, and the link identification system 204 can use the identified keywords for locating informational pages.

FIG. 3 is a flowchart of an example process 300 for associating content items with informational pages and queries that are relevant to the informational pages. The process 300 can be performed, for example, by the content management system 110 described above with respect to FIG. 1, or the content system 202 described above with respect to FIG. 2.

A determination is made as to whether there is another content item to process in an inventory of content items (301). For example, the process 300 can be performed for each of a plurality of content items in the inventory of content items. If all of the content items have been processed (e.g., there are no more content items to process), the process 300 ends.

If there is a content item to process, an entity associated with the content item and a plurality of page types for a vertical associated with a product or service described in the content item are identified (302). The entity can be, for example, a content sponsor and the content item can be, for example, an advertisement. As another example, the content item can be a search result and the entity (e.g., a publisher, content sponsor) can be associated with a resource that is associated with the search result.

A page type can be associated with an informational need of a user. In some implementations, a set of page types is retrieved from a database based on the vertical. For example, the database can include a predefined set of page types for each of multiple verticals. Page types can include, for example, for a retail vertical, how-to guide, buying guide, reviews, product walkthrough, product gallery, customer reviews, question and answer, live chat, technical specifications, technical support, top lists, or side-by-side comparisons.

In some implementations, a corpus of documents associated with a content sponsor is evaluated to determine the page types. For example, a corpus of documents associated with a particular content sponsor or with a set of content sponsors can be evaluated. In some implementations, evaluating the corpus of documents includes evaluating titles of documents in the corpus and extracting, from the titles, n-grams that do not include a product, service, or brand name associated with the content sponsor. The extracted n-grams can be used to identify the page types.

A plurality of informational pages associated with the entity are located (304). In some implementations, a list of informational pages associated with the content sponsor is received, such as from the content sponsor. In some implementations, a corpus of documents associated with the content sponsor is evaluated to identify the plurality of informational pages. For example, a set of URLs that constitute informational links for the page types can be chosen from a total set of URLs associated with the content sponsor.

Each informational page is classified based on the page types (306). In some implementations, determining which of the identified page types to assign to a content item includes evaluating URL patterns. For example, a URL pattern can be identified that is associated with a particular page type and if the informational page matches the URL pattern, the informational page can be determined to be of the particular page type. In some implementations, determining which of the identified page types to assign to a content item includes evaluating title n-grams. For example, an informational page can be classified as a particular page type if the title of the informational page includes a page type n-gram (e.g., after optional stemming and stop-word removal). Other approaches can be used to classify the informational page based on page type.

Queries associated with the entity are identified (308), wherein a query is used as selection criteria for delivering one or more content items associated with the entity. In some implementations, the content sponsor provides the queries. In some implementations, query terms can be determined by analyzing title n-grams or URL patterns of identified informational pages.

For each informational page of the plurality of informational pages, relevant queries are determined from the identified queries (310). For example, one or more keywords can be determined for a given informational page, and identifying a relevant query can include finding one or more queries that include the determined one or more keywords.
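
The keyword-based matching described above can be illustrated with a short sketch. The function name is hypothetical, and treating a query as relevant when it shares at least one keyword with the page (rather than all keywords) is an assumption; the text does not specify which rule applies.

```python
def relevant_queries(page_keywords, queries):
    """Return the queries that contain at least one of the page's keywords.

    Matching is case-insensitive and on whole whitespace-separated terms.
    """
    keywords = {keyword.lower() for keyword in page_keywords}
    matches = []
    for query in queries:
        terms = set(query.lower().split())
        if keywords & terms:
            matches.append(query)
    return matches


matched = relevant_queries(
    ["blender", "reviews"],
    ["best blender", "toaster deals", "Blender Reviews 2024"],
)
```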

An association between the content item, data associated with the relevant queries, and associated informational pages is stored in a data structure (312). The association can be keyed, for example, by an identifier of the content sponsor associated with the content item and data associated with the relevant queries.
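
One way to picture such a data structure is an in-memory index keyed by a (sponsor identifier, query) pair. The class name, field names, and the use of the lowercased query string as the "data associated with the relevant queries" are assumptions for this sketch; an actual system could use a database or a different key.

```python
from collections import defaultdict


class AnnotationIndex:
    """Toy index mapping (sponsor_id, query) to informational page records."""

    def __init__(self):
        self._index = defaultdict(list)

    def add(self, sponsor_id, query, content_item_id, page_url, page_type):
        """Store one association between a content item and a page."""
        key = (sponsor_id, query.lower())
        self._index[key].append(
            {
                "content_item_id": content_item_id,
                "page_url": page_url,
                "page_type": page_type,
            }
        )

    def lookup(self, sponsor_id, query):
        """Return a copy of the records stored for the sponsor and query."""
        return list(self._index.get((sponsor_id, query.lower()), []))


index = AnnotationIndex()
index.add("sponsor-1", "best blender", "ad-42",
          "https://example.com/reviews/blender", "reviews")
records = index.lookup("sponsor-1", "Best Blender")
```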

FIG. 4 is a flowchart of an example process 400 for providing an augmented content item to a user. The process 400 can be performed, for example, by the content management system 110 described above with respect to FIG. 1, or the content server 202 described above with respect to FIG. 2.

A request for content is received (402). For example, the request can be received from a user device for a content item to be presented in a content slot.

A content item responsive to the request is identified (404). For example, an auction can be performed and a content item with a highest associated bid or a highest auction score (e.g., a score that is computed as a function of a bid and/or a quality measure) can be identified.
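
As a rough illustration of selecting by auction score, the sketch below multiplies the bid by a quality measure. The product is only one possible scoring function and is an assumption of this example, as are the `Candidate` type and its field names.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    content_item_id: str
    bid: float
    quality: float  # hypothetical quality measure, e.g. in the range 0..1


def auction_score(candidate):
    """Assumed scoring function: bid weighted by quality."""
    return candidate.bid * candidate.quality


def select_content_item(candidates):
    """Return the candidate with the highest auction score, or None."""
    if not candidates:
        return None
    return max(candidates, key=auction_score)


winner = select_content_item(
    [Candidate("ad-a", bid=2.0, quality=0.5),
     Candidate("ad-b", bid=1.5, quality=0.9)]
)
```

Here the lower bid wins because its quality measure gives it the higher score (1.35 versus 1.0), which shows why a score can differ from a pure highest-bid rule.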

One or more informational pages associated with the content item are identified (406). The one or more informational pages can be identified based at least in part on a content sponsor associated with the content item and terms included in the request and/or the landing page associated with the content item.

A link is generated for an informational page of the one or more informational pages (408). One or more informational pages can be selected from the identified informational pages based on one or more criteria. The criteria can be based on a function of a task that the user is inferred to be performing in relation to the request. The task can be located along a path toward conversion, and the criteria can relate to furthering the user along the path toward conversion. An informational link can be generated based on a selected informational page. In some implementations, a title can be automatically constructed for the generated link, such as based on one or more criteria including page type and available screen space.
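
Automatic title construction based on page type and available screen space could, for instance, pick the longest title variant that fits a character budget. The variant table, the character-count proxy for screen space, and the function name are all assumptions of this sketch.

```python
# Hypothetical title variants per page type, ordered from longest to shortest.
TITLE_VARIANTS = {
    "customer reviews": ["Read customer reviews", "Customer reviews", "Reviews"],
    "buying guide": ["See the buying guide", "Buying guide", "Guide"],
}


def build_link_title(page_type, max_chars):
    """Return the first (longest) variant that fits in max_chars, else None.

    Page types without a variant table fall back to the title-cased type name.
    """
    variants = TITLE_VARIANTS.get(page_type, [page_type.title()])
    for variant in variants:
        if len(variant) <= max_chars:
            return variant
    return None
```

With this table, a wide slot yields "Read customer reviews", while a slot limited to eight characters falls back to "Reviews".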

The content item is augmented with the generated link (410). For example, the generated link can be included in, appended to, or otherwise visually associated with the content item.

The augmented content item is provided responsive to the request (412). For example, the augmented content item can be provided to a user device for presentation in a content slot.

FIG. 5 is a flowchart of an example process 500 for associating a content sponsor and a content item with associated informational pages. The process 500 can be performed, for example, by the content management system 110 described above with respect to FIG. 1, or the content server 202 described above with respect to FIG. 2.

A content item and an entity associated with the content item are identified (502). The entity can be, for example, a content sponsor and the content item can be, for example, an advertisement. As another example, the content item can be a search result and the entity (e.g., a publisher, content sponsor) can be associated with a resource that is associated with the search result.

A plurality of informational pages associated with the entity are identified (504). In some implementations, a list of informational pages associated with the content sponsor is received, such as from the content sponsor. In some implementations, a corpus of documents associated with the content sponsor is evaluated to identify the plurality of informational pages. For example, a set of URLs that constitute informational links for the page types can be chosen from a total set of URLs associated with the content sponsor.

A plurality of page types associated with informational content that are related to the entity are determined (506). In some implementations, a set of page types is retrieved from a database based on a vertical associated with the entity. In some implementations, a corpus of documents associated with the entity is evaluated to determine the page types.

Each informational page is classified based on the page types (508). In some implementations, determining which of the identified page types to assign to a content item includes evaluating URL patterns. In some implementations, determining which of the identified page types to assign to a content item includes evaluating title n-grams.

A determination is made as to which of the informational pages relate to the content item based on one or more criteria (510). The one or more criteria can include selection criteria which are used to determine when to deliver the content item, content of the content item, or content of a landing page associated with the content item.

An association between the content item, the content sponsor, and related informational pages based on page type is stored in a database (512). The association can be keyed, for example, by an identifier of the content sponsor.

FIG. 6 is a block diagram of computing devices 600, 650 that may be used to implement the systems and methods described in this document, as either a client or as a server or plurality of servers. Computing device 600 is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Computing device 650 is intended to represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, and other similar computing devices. The components shown here, their connections and relationships, and their functions, are meant to be illustrative only, and are not meant to limit implementations of the inventions described and/or claimed in this document.

Computing device 600 includes a processor 602, memory 604, a storage device 606, a high-speed interface 608 connecting to memory 604 and high-speed expansion ports 610, and a low speed interface 612 connecting to low speed bus 614 and storage device 606. Each of the components 602, 604, 606, 608, 610, and 612, are interconnected using various busses, and may be mounted on a common motherboard or in other manners as appropriate. The processor 602 can process instructions for execution within the computing device 600, including instructions stored in the memory 604 or on the storage device 606 to display graphical information for a GUI on an external input/output device, such as display 616 coupled to high speed interface 608. In other implementations, multiple processors and/or multiple buses may be used, as appropriate, along with multiple memories and types of memory. Also, multiple computing devices 600 may be connected, with each device providing portions of the necessary operations (e.g., as a server bank, a group of blade servers, or a multi-processor system).

The memory 604 stores information within the computing device 600. In one implementation, the memory 604 is a computer-readable medium. The computer-readable medium is not a propagating signal. In one implementation, the memory 604 is a volatile memory unit or units. In another implementation, the memory 604 is a non-volatile memory unit or units.

The storage device 606 is capable of providing mass storage for the computing device 600. In one implementation, the storage device 606 is a computer-readable medium. In various different implementations, the storage device 606 may be a floppy disk device, a hard disk device, an optical disk device, or a tape device, a flash memory or other similar solid state memory device, or an array of devices, including devices in a storage area network or other configurations. In one implementation, a computer program product is tangibly embodied in an information carrier. The computer program product contains instructions that, when executed, perform one or more methods, such as those described above. The information carrier is a computer- or machine-readable medium, such as the memory 604, the storage device 606, or memory on processor 602.

The high speed controller 608 manages bandwidth-intensive operations for the computing device 600, while the low speed controller 612 manages lower bandwidth-intensive operations. Such allocation of duties is illustrative only. In one implementation, the high-speed controller 608 is coupled to memory 604, display 616 (e.g., through a graphics processor or accelerator), and to high-speed expansion ports 610, which may accept various expansion cards (not shown). In the implementation, low-speed controller 612 is coupled to storage device 606 and low-speed expansion port 614. The low-speed expansion port, which may include various communication ports (e.g., USB, Bluetooth, Ethernet, wireless Ethernet) may be coupled to one or more input/output devices, such as a keyboard, a pointing device, a scanner, or a networking device such as a switch or router, e.g., through a network adapter.

The computing device 600 may be implemented in a number of different forms, as shown in the figure. For example, it may be implemented as a standard server 620, or multiple times in a group of such servers. It may also be implemented as part of a rack server system 624. In addition, it may be implemented in a personal computer such as a laptop computer 622. Alternatively, components from computing device 600 may be combined with other components in a mobile device (not shown), such as device 650. Each of such devices may contain one or more of computing device 600, 650, and an entire system may be made up of multiple computing devices 600, 650 communicating with each other.

Computing device 650 includes a processor 652, memory 664, an input/output device such as a display 654, a communication interface 666, and a transceiver 668, among other components. The device 650 may also be provided with a storage device, such as a microdrive or other device, to provide additional storage. Each of the components 650, 652, 664, 654, 666, and 668, are interconnected using various buses, and several of the components may be mounted on a common motherboard or in other manners as appropriate.

The processor 652 can process instructions for execution within the computing device 650, including instructions stored in the memory 664. The processor may also include separate analog and digital processors. The processor may provide, for example, for coordination of the other components of the device 650, such as control of user interfaces, applications run by device 650, and wireless communication by device 650.

Processor 652 may communicate with a user through control interface 658 and display interface 656 coupled to a display 654. The display 654 may be, for example, a TFT LCD display or an OLED display, or other appropriate display technology. The display interface 656 may comprise appropriate circuitry for driving the display 654 to present graphical and other information to a user. The control interface 658 may receive commands from a user and convert them for submission to the processor 652. In addition, an external interface 662 may be provided in communication with processor 652, so as to enable near area communication of device 650 with other devices. External interface 662 may provide, for example, for wired communication (e.g., via a docking procedure) or for wireless communication (e.g., via Bluetooth or other such technologies).

The memory 664 stores information within the computing device 650. In one implementation, the memory 664 is a computer-readable medium. In one implementation, the memory 664 is a volatile memory unit or units. In another implementation, the memory 664 is a non-volatile memory unit or units. Expansion memory 674 may also be provided and connected to device 650 through expansion interface 672, which may include, for example, a SIMM card interface. Such expansion memory 674 may provide extra storage space for device 650, or may also store applications or other information for device 650. Specifically, expansion memory 674 may include instructions to carry out or supplement the processes described above, and may include secure information also. Thus, for example, expansion memory 674 may be provided as a security module for device 650, and may be programmed with instructions that permit secure use of device 650. In addition, secure applications may be provided via the SIMM cards, along with additional information, such as placing identifying information on the SIMM card in a non-hackable manner.

The memory may include, for example, flash memory and/or MRAM memory, as discussed below. In one implementation, a computer program product is tangibly embodied in an information carrier. The computer program product contains instructions that, when executed, perform one or more methods, such as those described above. The information carrier is a computer- or machine-readable medium, such as the memory 664, expansion memory 674, or memory on processor 652.

Device 650 may communicate wirelessly through communication interface 666, which may include digital signal processing circuitry where necessary. Communication interface 666 may provide for communications under various modes or protocols, such as GSM voice calls, SMS, EMS, or MMS messaging, CDMA, TDMA, PDC, WCDMA, CDMA2000, or GPRS, among others. Such communication may occur, for example, through radio-frequency transceiver 668. In addition, short-range communication may occur, such as using a Bluetooth, WiFi, or other such transceiver (not shown). In addition, GPS receiver module 670 may provide additional wireless data to device 650, which may be used as appropriate by applications running on device 650.

Device 650 may also communicate audibly using audio codec 660, which may receive spoken information from a user and convert it to usable digital information. Audio codec 660 may likewise generate audible sound for a user, such as through a speaker, e.g., in a handset of device 650. Such sound may include sound from voice telephone calls, may include recorded sound (e.g., voice messages, music files, etc.) and may also include sound generated by applications operating on device 650.

The computing device 650 may be implemented in a number of different forms, as shown in the figure. For example, it may be implemented as a cellular telephone 680. It may also be implemented as part of a smartphone 682, personal digital assistant, or other similar mobile device.

Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, specially designed ASICs (application specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations can include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.

These computer programs (also known as programs, software, software applications or code) include machine instructions for a programmable processor, and can be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms “machine-readable medium” and “computer-readable medium” refer to any computer program product, apparatus and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term “machine-readable signal” refers to any signal used to provide machine instructions and/or data to a programmable processor.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user and a keyboard and a pointing device (e.g., a mouse or a trackball) by which the user can provide input to the computer. Other kinds of devices can be used to provide for interaction with a user as well; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form, including acoustic, speech, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network (“LAN”), a wide area network (“WAN”), and the Internet.

The computing system can include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.

A number of embodiments of the invention have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the invention. For example, various forms of the flows shown above may be used, with steps re-ordered, added, or removed. Also, although several applications of the payment systems and methods have been described, it should be recognized that numerous other applications are contemplated. Accordingly, other embodiments are within the scope of the following claims.

What is claimed is:
 1. A method comprising for each of a plurality of content items in an inventory of content items: identifying an entity associated with the content item and a plurality of page types for a vertical associated with a product or service described in the content item; locating a plurality of informational pages associated with the entity; classifying each informational page based on the page types; identifying queries associated with the entity, wherein a query is used as a selection criteria for delivering one or more content items associated with the entity; for each informational page of the plurality of informational pages determining relevant queries from the identified queries; and storing in a data structure an association between the content item, data associated with the relevant queries and associated informational pages.
 2. The method of claim 1 wherein the entity is a content sponsor.
 3. The method of claim 1 wherein determining relevant queries from the identified queries includes determining one or more keywords for a given informational page and identifying a relevant query includes finding one or more queries that include the determined one or more keywords.
 4. The method of claim 1 further comprising receiving a request for content; identifying a content item responsive to the request; identifying one or more informational pages associated with the content item based at least in part on a content sponsor associated with the content item and terms included in the request and/or the landing page associated with the content item; generating a link for an informational page of the one or more informational pages; augmenting the content item with the generated link; and providing the augmented content item responsive to the request.
 5. The method of claim 1 wherein identifying a plurality of page types for a vertical associated with a product or service described in the content item includes retrieving a set of page types from a database based on the vertical.
 6. The method of claim 1 wherein identifying the plurality of page types includes evaluating a corpus of documents associated with a document sponsor to determine the page types.
 7. The method of claim 6 wherein evaluating the corpus includes evaluating titles of documents in the corpus including extracting n-grams from the titles that do not include a product, service or brand name associated with the content sponsor and using the extracted n-grams to identify the page types.
 8. The method of claim 1 further comprising determining which of the identified page types to assign to a content item by evaluating one or more of URL (Uniform Resource Locator) patterns or title n-grams.
 9. The method of claim 2 wherein locating a plurality of informational pages associated with the content sponsor includes receiving a list of informational pages associated with the content sponsor.
 10. The method of claim 2 wherein locating a plurality of informational pages associated with the content sponsor includes evaluating a corpus of documents associated with the content sponsor to identify the plurality of informational pages.
 11. The method of claim 2 wherein locating a plurality of informational pages associated with the content sponsor includes identifying a set of URLs that constitute informational links for the page types of the plurality of page types chosen from a total set of URLs associated with the content sponsor.
 12. The method of claim 1 wherein the page types are associated with informational needs of a user selected from how-to, buying guide, reviews, product walkthrough, product gallery, customer reviews, question and answer, live chat, technical specifications, technical support, top lists, or side-by-side comparisons.
 13. The method of claim 4 wherein identifying one or more informational pages includes selecting, based on one or more criteria, one or more informational pages from the identified informational pages and generating an informational link based on a selected informational page.
 14. The method of claim 13 further comprising determining a task associated with the request, wherein the task is located along a path toward conversion and wherein the criteria relates to furthering the user along the path toward conversion.
 15. The method of claim 13 further comprising assembling the generated informational links in an order and providing the assembled informational links along with the content item responsive to the request.
 16. The method of claim 4 further comprising determining a subset of the informational pages to present based on one or more criteria and augmenting includes presenting informational links associated with the subset.
 17. The method of claim 16 wherein the criteria are based on a function of a task that is inferred that the user is performing related to the request.
 18. The method of claim 4 further comprising automatically constructing a title for the generated link.
 19. The method of claim 18 wherein the title is automatically constructed based on one or more criteria including page type and screen space available.
 20. The method of claim 2 wherein locating a plurality of informational pages associated with the content sponsor includes submitting a request to a search system for the search system to locate informational pages that include one or more page type n-grams and receiving information from the search system identifying the located informational pages.
 21. A method comprising: identifying a content item and an entity associated with the content item; identifying a plurality of informational pages associated with the entity; determining a plurality of page types associated with informational content that are related to the entity; classifying each informational page based on the page types; determining which of the informational pages relate to the content item based on one or more criteria; and storing in a database an association between the content item, the content sponsor, and related informational pages based on page type.
 22. The method of claim 21 wherein the one or more criteria include selection criteria which are used to determine when to deliver the content item, content of the content item, or content of a landing page associated with the content item.
 23. The method of claim 21 wherein the entity is a content sponsor.
 24. A system comprising: a link and criteria identification system; an annotation serving system; an annotation rendering system; and a content selector; wherein the content selector is configured to: receive a request for content; and identify a content item responsive to the request; and wherein the link and criteria identification system is configured to identify one or more informational pages associated with the content item based at least in part on a content sponsor associated with the content item and terms included in the request and/or the landing page associated with the content item; and wherein the annotation serving system is configured to: generate a link for an informational page of the one or more informational pages; and augment the content item with the generated link; and wherein the annotation rendering system is configured to provide the augmented content item responsive to the request.
 25. A computer program product tangibly embodied in a computer-readable storage device and comprising instructions that, when executed by a processor, cause the processor to: for each of a plurality of content items in an inventory of content items: identify an entity associated with the content item and a plurality of page types for a vertical associated with a product or service described in the content item; locate a plurality of informational pages associated with the entity; classify each informational page based on the page types; identify queries associated with the entity, wherein a query is used as a selection criteria for delivering one or more content items associated with the entity; for each informational page of the plurality of informational pages determine relevant queries from the identified queries; and store in a data structure an association between the content item, data associated with the relevant queries and associated informational pages. 